F/G 5/9


AD-A097  353 


UNCLASSIFIED 


RESEARCH APPLICATIONS INC ROCKVILLE MD
ADAPTIVE TESTING WITHOUT A COMPUTER. (U)
MAR 81  D FRIEDMAN, A STEINBERG, M J REE


F33615-79-C-0018 


AIR FORCE HUMAN RESOURCES LABORATORY


ADAPTIVE TESTING WITHOUT A COMPUTER

By

D. Friedman
A. Steinberg
Research Applications, Incorporated
Rockville, Maryland

Malcolm James Ree

MANPOWER AND PERSONNEL DIVISION
Brooks Air Force Base, Texas

March 1981

Final Report


AIR FORCE SYSTEMS COMMAND
BROOKS AIR FORCE BASE, TEXAS 78235


NOTICE

When U.S. Government drawings, specifications, or other data are used for any
purpose other than a definitely related Government procurement operation, the
Government thereby incurs no responsibility nor any obligation whatsoever, and
the fact that the Government may have formulated, furnished, or in any way
supplied the said drawings, specifications, or other data is not to be regarded
by implication or otherwise, as in any manner licensing the holder or any other
person or corporation, or conveying any rights or permission to manufacture,
use, or sell any patented invention that may in any way be related thereto.

This final report was submitted by Research Applications, Incorporated,
Rockville, Maryland, under Contract F33615-79-C-0018, Project ILIR, with the
Manpower and Personnel Division, Air Force Human Resources Laboratory (AFSC),
Brooks Air Force Base, Texas. Malcolm James Ree was the Contract Monitor for
the Laboratory.

This report has been reviewed by the Office of Public Affairs (PA) and is
releasable to the National Technical Information Service (NTIS). At NTIS, it
will be available to the general public, including foreign nations.

This technical report has been reviewed and is approved for publication.

NANCY GUINN, Technical Director
Manpower and Personnel Division

RONALD W. TERRY, Colonel, USAF
Commander

REPORT DOCUMENTATION PAGE (READ INSTRUCTIONS BEFORE COMPLETING FORM)

4. TITLE (and Subtitle): ADAPTIVE TESTING WITHOUT A COMPUTER

5. TYPE OF REPORT & PERIOD COVERED: Final

7. AUTHOR(s): D. Friedman, A. Steinberg, Malcolm James Ree

8. CONTRACT OR GRANT NUMBER(s): F33615-79-C-0018

9. PERFORMING ORGANIZATION NAME AND ADDRESS: Research Applications,
   Incorporated, Rockville, Maryland

11. CONTROLLING OFFICE NAME AND ADDRESS: HQ Air Force Human Resources
    Laboratory (AFSC), Brooks Air Force Base, Texas 78235

14. MONITORING AGENCY NAME & ADDRESS (if different from Controlling Office):
    Manpower and Personnel Division, Air Force Human Resources Laboratory,
    Brooks Air Force Base, Texas 78235

15. SECURITY CLASS. (of this report): Unclassified

16. DISTRIBUTION STATEMENT (of this Report): Approved for public release;
    distribution unlimited.

19. KEY WORDS (Continue on reverse side if necessary and identify by block
    number): adaptive testing; psychometrics

20. ABSTRACT (Continue on reverse side if necessary and identify by block
    number)


Three prototypes of paper-and-pencil based adaptive tests were developed,
refined, and produced in sufficient quantities for field testing with groups
of subjects. Two aptitude areas were used in each prototype. These were Word
Knowledge and Arithmetic Reasoning. A total of 711 basic airmen recruits were
administered the prototypes in both aptitude areas as well as traditional
paper-and-pencil tests of both areas. Additionally, enlistment scores on the
Armed Forces Qualification Test (AFQT) and scores for selection composites
were available for each subject. It was found that the adaptive tests
correlated highly with the like-named paper-and-pencil tests, and correlations
with AFQT and the selection composites were about the same for traditional
tests and adaptive tests. The adaptive tests showed a large advantage in time
of administration, with savings ranging from one-third to one-half. A full
adaptive test battery based on the prototypes would allow for the addition of
about six more aptitude areas. This would provide better measurement by
enabling more data to be collected on each examinee.


PREFACE


This research was conducted under ILIR0017, Adaptive Testing Without A
Computer.

The authors wish to express their appreciation to the testing detachment
of the Manpower and Personnel Division at Lackland AFB, Texas, the
Technical Service Division and James B. Sympson for contributions to
this effort.


I.  INTRODUCTION 

Item response theory, often referred to as latent-trait theory, has provided
the tools for solving the problem of tailoring a test to the individual.
Traditionally, the same test is given to all individuals regardless of the
ability level of the individual and the difficulty level of the test. This
mismatch may result in decreased precision of measurement which may, in turn,
lead to misclassification, errors of selection, poor use of scarce resources
and selection of individuals who are ill-equipped to perform the tasks at hand.

The development of latent-trait theory (see Lord & Novick, 1968) has been the
latest in a constant trend toward making human aptitude measurement more
precise by adapting tests to examinees.

As early as the beginning of the twentieth century, Alfred Binet (see Peterson,
1926) developed adaptive tests for educational screening. The success of the
group-administered tests developed during the first World War, coupled with the
long administration time of the Binet tests, changed the course of test
development to efforts aimed at producing the more economical paper-and-pencil
group-administered non-adaptive measurements which have become the standard.

The  advent  of  relatively  inexpensive  and  portable  computers  has  made  feasible 
computer-directed  adaptive  testing.  In  the  last  decade,  numerous  studies  have 
been  undertaken  in  an  attempt  to  accomplish  adaptive  measurement  using 
computers  (see  Weiss,  1977). 

Computers, however, are prone to failures at unpredictable times and are still
more expensive than paper-and-pencil media. This effort, therefore, was
designed  to  investigate  the  feasibility  of  developing  sophisticated  adaptive 
tests  which  do  not  rely  on  computer  administration  techniques.  Such  tests 
would  eliminate  the  need  for  costly  machines,  capture  the  advantages  of  latent- 
trait  theory,  and  be  as  portable  as  ordinary  test  booklets. 

II.  METHOD 

The  Adaptive  Test 

For  this  effort,  an  adaptive  test  was  defined  as  a  test  composed  of  several 
scorable  items  which  were  administered  sequentially,  so  that  the  item  presented 
was  based  on  the  results  of  the  preceding  question,  or  on  the  results  of  all 
the  preceding  questions.  In  an  adaptive  testing  environment,  the  examinee  is 
routed from item to item so that not all examinees necessarily answer all
questions  nor  necessarily  the  same  number  of  questions  (McBride,  1977). 

Item Pools

Two aptitude content areas, Word Knowledge (WK) and Arithmetic Reasoning (AR),
were used for the adaptive tests. Using the maximum likelihood procedure
described by Wingersky and Lord (1973), the test items for these content areas
had been calibrated on a sample of more than 1,000 Air Force recruits. Each
ability area was calibrated separately using the three-parameter logistic
model. Items which had parameters out of range were deleted from the pools,
leaving a set of items which were appropriate for the testing task.
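For readers unfamiliar with the three-parameter logistic model used in the calibration, a minimal sketch follows. The scaling constant D = 1.7 and the example item parameters are assumptions chosen for illustration; the report does not list either.

```python
import math

D = 1.7  # conventional scaling constant (assumed; not stated in the report)


def p_correct(theta, a, b, c):
    """Probability of a correct response under the three-parameter logistic model.

    theta: examinee ability; a: discrimination; b: difficulty; c: lower asymptote.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical item: a = 1.2, b = 0.0, c = 0.2
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(p_correct(theta, 1.2, 0.0, 0.2), 3))
```

When ability equals the item difficulty, the probability is halfway between the lower asymptote c and 1, which is why the hypothetical item above gives 0.6 at theta = 0.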

Prototype Development

Five prototype strategies for adaptive testing were proposed, and three of
those were selected for tryout on small samples of Air Force basic recruits
to refine procedures and techniques. The prototypes were designed so that
once the initial instructions were given, the subject would not require
further assistance to complete the test.

A "routing test", followed by a "measurement test", was used in each prototype.
These procedures resulted in a two-stage test protocol. Two methods of routing
the subject from item to item were used. For one method, all subjects
answered all items in the first stage of the test. Depending on their
performance on the first stage, they were routed to one of five second-stage
tests.

For the second routing method, all subjects started with the first item in
the first stage of the test. Depending on whether their response was correct
or incorrect, subjects were routed to a more or less difficult item. This
same procedure was followed for each subsequent item in the first stage.

The sequence of items answered determined the level of the test to be
taken at the second stage.
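The two routing methods can be sketched as follows. This is a toy illustration only: the cut scores, the 10-item first stage, and the four-step pyramid are hypothetical values chosen so that each method yields five second-stage levels; they are not the layouts used in the prototypes.

```python
from bisect import bisect_right

LEVELS = ["I", "II", "III", "IV", "V"]  # five second-stage tests
CUTS = [2, 4, 6, 8]  # hypothetical number-right boundaries on a 10-item first stage


def route_by_number_right(responses):
    """Method one: everyone answers every first-stage item; route on the number right."""
    return LEVELS[bisect_right(CUTS, sum(responses))]


def route_item_by_item(answer_fn, steps=4):
    """Method two: each response sends the examinee to a harder or an easier item.

    answer_fn(item) returns True for a correct response. Items are identified by
    their pyramid position (step, number correct so far), a hypothetical layout.
    """
    n_correct = 0
    path = []
    for step in range(steps):
        item = (step, n_correct)
        correct = bool(answer_fn(item))
        path.append((item, correct))
        n_correct += int(correct)
    return LEVELS[n_correct], path


print(route_by_number_right([1, 1, 1, 1, 1, 0, 0, 0, 0, 0]))
print(route_item_by_item(lambda item: item[0] % 2 == 0)[0])
```

In the second method the examinee never sees the items off the chosen path, which is what lets the first stage stay short.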

Prototype I

In Prototype I (PI), each examinee used a cardboard box containing a set of
7.62 x 12.70 cm (3 x 5-inch) item cards. The test items for the two subtests
were printed on these cards, and the tests were color-coded. To prevent
disarrangement, the cards were held in the box by two rods that passed
through holes in the cards. The design of the box was such that, when
necessary, worn subtests or items could be easily replaced by the
administrator.

The examinees were provided with a one-page answer sheet and a separate
one-page instruction sheet. The answer sheet corresponded to the individual
subtest in the number of questions and response options.

A manual was provided as part of the package of materials. A visual aid to
assist the administrator in the instruction of the examinees and a pen with
water-based ink for use with the display were provided.

Prototype  II 


Prototype II (PII) consisted of a set of two question booklets for each subtest.

The questions for the first part of each subtest were presented in a small,
spiral-bound booklet, which contained tabbed 7.62 x 12.70 cm (3 x 5-inch) cards
and cover pages. The questions for the second part of the subtest were printed
in a booklet 21.59 x 27.94 cm (8 1/2 x 11 inches). The examinees were referred
to the appropriate measurement test based on the directions provided on a
separate one-page instruction sheet. Each examinee used a total of two sets of
question booklets and instruction sheets for each administration.

The answer sheet for PII was scannable and had invisible numbers and marks
precoded in the response areas. The examinees used special crayons to mark
their answers. Use of these crayons revealed the previously hidden marks.

One 27.94 x 43.18 cm (11 x 17-inch) answer page printed on both sides of the
paper was used for the subtest.

A manual was provided for the administrator to explain the procedures to be
followed in PII. A visual aid was provided to aid the administrator in
explaining the routing directions for PII. The visual aid was constructed
to illustrate how the hidden marks were to be revealed on the answer sheet
to respond to each test item.

Prototype  III 

For this third prototype (PIII), the questions were presented in a 21.59 x 27.94
cm (8 1/2 x 11-inch) booklet. The responses were recorded by the examinees on
a carbonless transfer answer-sheet set. Each examinee used two question booklets
and carbonless transfer answer-sheet sets. Each answer-sheet set was specifically
designed to correspond to a particular subtest.

A carbonless transfer answer-sheet set consisted of two pages. The top page
was a machine-scannable answer sheet that was spot-glued to a second sheet
of paper. The reverse side of the machine-scannable answer sheet was covered
with a block pattern to inhibit reading of the second sheet, and was treated
so that markings made on the answer sheet were transferred to the second
page of the set. The second page provided the examinees with instructions
that routed them to the appropriate measurement test based on their responses
to the first part of the test.

An instruction manual for PIII was provided to the administrator. Two visual
aids were used by the administrator to explain the routing scheme for PIII.
Each visual aid corresponded to one page of the answer-sheet set. A pen
with water-based ink was provided for use by the administrator with the visual
aids.


Routing Test Development

The routing test for Prototypes I and II (PI and PII) directed the examinee
from item to item depending on the response to the previous item. A maximum
information item-selection procedure was used for these two routing tests
(Sympson, 1977). Items which maximized the item-information function
(Birnbaum, 1968) at the estimated ability level, θ, were selected after each
item was answered. Fourteen items were available in each of these tests.
Figure 1 shows the possible paths through the items.
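As an illustration of maximum-information item selection, the sketch below computes the item-information function for a three-parameter logistic item and picks the unused item with the largest information at the current ability estimate. The scaling constant D = 1.7, the three-item pool, and the function names are assumptions for illustration, not the routing items or procedure details of the prototypes.

```python
import math

D = 1.7  # conventional scaling constant (assumed)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for a three-parameter logistic item at ability theta."""
    p = p_correct(theta, a, b, c)
    return (D * a) ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def most_informative(theta, pool, used=()):
    """Return the name of the unused item with maximum information at theta."""
    candidates = {name: prm for name, prm in pool.items() if name not in used}
    return max(candidates, key=lambda name: item_information(theta, *candidates[name]))


# Hypothetical pool: name -> (a, b, c)
POOL = {"easy": (1.0, -1.5, 0.2), "medium": (1.0, 0.0, 0.2), "hard": (1.0, 1.5, 0.2)}

print(most_informative(0.0, POOL))
print(most_informative(0.0, POOL, used={"medium"}))
```

Because every possible response path can be worked out in advance, the selections can be printed on cards or booklet pages, which is what removes the need for a computer at testing time.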

Figure 1. Paths through the routing tests for PI and PII. (Numbers indicate
items; + and - indicate correct and incorrect responses, respectively.)

The routing test for Prototype III (PIII) was a short, peaked measure of
ability. There were eight items used in the Arithmetic Reasoning test and
10 items used in the Word Knowledge test.

Design  of  Administration  Instructions 

The administration instructions were prepared as integral parts of the
prototypes. The test administrators were only to be available to reinforce
these instructions or to answer appropriate questions.

The instructions were tried out with a number of volunteers whose ages ranged
from nine years through adult and whose educational levels ranged from fourth
grade through graduate school. On the basis of those pre-experimental trials,
changes were made to the instructions in the prototypes and to the
administration instructions. Instructions for the practice sessions and the
special visual aids appropriate to each prototype were developed and refined.
The administrators were trained in the use of these materials.

Field  Test 

A total of 711 airmen participated in the field test. Each took the Word
Knowledge (WK) and Arithmetic Reasoning (AR) subtests from the Armed Services
Vocational Aptitude Battery (ASVAB), as well as the adaptive WK and AR tests.

In addition, enlistment qualification scores (scores of record) on the
Mechanical, Administrative, General, and Electronics (M, A, G, E) composites of
the ASVAB, as well as the composite known as the Armed Forces Qualification
Test (AFQT), were available for every subject. Other demographic data were
also collected.

Instructional manuals were prepared for use by the administrators in assignment
of subjects to prototype and subtest. At least 40 subjects were tested
at  each  session.  If  the  administrators  encountered  any  problems  at  any  of 
the  sessions,  they  were  asked  to  record  these  problems  and  resolutions  in 
the  manuals  for  review  by  the  contractor.  The  initial  day  of  administration 
was  observed  by  the  researchers. 

For the field tryout of the prototypes, a practice test and an actual test
were administered. Half of the subjects were randomly assigned to the WK
adaptive tests and half were assigned the AR adaptive tests for the practice
test. For the actual testing session, the assignment of subjects to an
adaptive test was reversed. Those subjects who were assigned the WK adaptive
test for the practice session took the AR adaptive test during the actual
testing session and vice versa. Thus, for each testing session, two adaptive
tests were administered to each subject, one for practice and one for actual
scoring.

Ability in the routing test for PI and PII was estimated by maximum
likelihood for each of the 32 possible combinations of right and wrong
answers.
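A table of ability estimates for all 32 right/wrong patterns of a five-item path can be computed ahead of time. The sketch below does this by grid search over the likelihood. The pyramid of item difficulties, the common a and c values, D = 1.7, and the bounds of -4 and +4 (needed because all-right and all-wrong patterns have no finite maximum) are all assumptions for illustration; the report does not describe how such patterns were handled.

```python
import math
from itertools import product

D = 1.7  # conventional scaling constant (assumed)
GRID = [g / 100.0 for g in range(-400, 401)]  # theta from -4.00 to 4.00


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_for(step, n_correct):
    """Hypothetical pyramid: difficulty rises 0.5 per correct answer, falls 0.5 per error."""
    n_wrong = step - n_correct
    return (1.0, 0.5 * (n_correct - n_wrong), 0.2)  # (a, b, c)


def log_likelihood(theta, pattern):
    """Log-likelihood of a 0/1 response pattern along the path it implies."""
    total, n_correct = 0.0, 0
    for step, u in enumerate(pattern):
        p = p_correct(theta, *item_for(step, n_correct))
        total += math.log(p if u else 1.0 - p)
        n_correct += u
    return total


def mle_theta(pattern):
    """Grid-search maximum-likelihood estimate, bounded to the grid."""
    return max(GRID, key=lambda t: log_likelihood(t, pattern))


TABLE = {pattern: mle_theta(pattern) for pattern in product((0, 1), repeat=5)}
print(len(TABLE), TABLE[(1, 0, 1, 0, 1)])
```

With the estimates tabulated in advance, routing reduces to looking up the examinee's response pattern.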

The routing test of PIII was designed so that all examinees took all items.

These items were arranged within a short band and produced a peaked-test
information function. The resultant ability estimate was used to route
examinees to the appropriate measurement test.

Measurement  Test  Development 

The measurement tests for PI and PII were the same. The medium for
administration of each prototype differed. The tests were developed to provide
maximum measurement precision within a relatively narrow range. This range
was determined by the resultant θ from the routing test. In order to ensure
adequate coverage of the ability continuum, the measurement test information
functions were carefully designed to overlap. Figure 2 represents the model.


Figure 2. Overlapping information functions for measurement tests (Tests I
through V).


The measurement tests for PIII were constituted in much the same manner as
those for PI and PII, except that cutting points were based on the number
right. Figures 3 through 6 show the actual information functions for the
measurement tests for all prototypes for both aptitude areas.


III.  RESULTS 


Descriptive statistics for age and non-adaptive WK and AR test scores were
computed for the subjects. Table 1 presents these statistics for the entire
sample. The sample was 75 percent male and 25 percent female. Table 2 shows
the ability scores, θ, obtained by subjects for each prototype.

Correlations were computed for all the variables. Tables 3, 4, and 5 show
the intercorrelations for all variables for PI, PII, and PIII.


A significance test was computed (Edwards, 1958) to determine if there were
differences between the correlation of the paper-and-pencil tests with AFQT
and the correlation of the like-named adaptive tests with AFQT. In no case
were the differences significant at the predetermined p < .05 level.
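The report does not name the statistic used beyond the citation. One standard choice for comparing two correlations that share a variable in the same sample (here, AFQT with each of two tests) is Hotelling's t, sketched below under that assumption. The correlations and sample size in the example are hypothetical, not values from the tables.

```python
import math


def hotelling_t(r_xy, r_xz, r_yz, n):
    """Hotelling's t for H0: rho_xy == rho_xz, with x, y, z measured on one sample.

    Returns (t, degrees of freedom), where df = n - 3.
    """
    det = 1.0 - r_xy ** 2 - r_xz ** 2 - r_yz ** 2 + 2.0 * r_xy * r_xz * r_yz
    t = (r_xy - r_xz) * math.sqrt((n - 3) * (1.0 + r_yz) / (2.0 * det))
    return t, n - 3


# Hypothetical values: AFQT with a conventional test (.70), AFQT with the
# like-named adaptive test (.68), the two tests with each other (.75), n = 104.
t, df = hotelling_t(0.70, 0.68, 0.75, 104)
print(round(t, 3), df)
```

With these hypothetical inputs the statistic is far below the two-tailed .05 critical value for about 100 degrees of freedom (roughly 1.98), which is the pattern the report describes.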

The time required to complete the adaptive tests was recorded. ASVAB
administration times are fixed. Table 6 displays a description of the time
required to complete both types of tests.

The  subjects  also  were  questioned  as  to  their  perceptions  of  the  adaptive  tests 
as compared to traditional paper-and-pencil tests. Table 7 presents a summary
of  their  responses. 


IV.  DISCUSSION 

Three  prototype  methods  were  developed  to  test  the  efficacy  of  the  use  of 
paper-and-pencil adaptive tests. Routing of the examinees through the test
was accomplished by one of two procedures. In one routing procedure, the
examinees were routed from item to item, depending on their answers to
previous items. The sequence of items answered determined the second-stage
level  of  testing.  The  second  routing  procedure  provided  for  all  the  examinees 
to  answer  the  same  items  in  the  first-stage  test.  The  number  of  correct 
responses  in  the  first  stage  determined  the  second-stage  level  of  testing. 

Two subtests (Arithmetic Reasoning and Word Knowledge) were administered
to  each  examinee  in  a  counter-balanced  design:  one  for  practice  and  one  for 
the  actual  test.  The  items  for  these  subtests  were  selected  from  item  pools 
provided  by  the  Air  Force  Human  Resources  Laboratory.  ASVAB  subtests  in  the 
same  areas  were  also  administered  to  each  examinee.  Examinees  participated 
as  subjects  for  one  of  three  prototypes.  These  data  were  correlated  with  the 
ASVAB  subtest  score  of  the  same  name,  and  enlistment  qualification  composites 
obtained  from  existing  records. 

The  results  of  the  analyses  showed  that  the  prototype  methods  were  successful. 
There  was  a  high  correlation  between  the  ability  estimates  of  the  examinees 
on  the  subtests  within  each  prototype  and  their  scores  on  corresponding  ASVAB 
subtests.  Significance  tests  indicated  that  these  observed  correlations  did 




Table 1

Descriptive Statistics: Age and Test Scores* for Subjects
(N = 711)

                          Standard
Variable        Mean      Deviation     Skew     Kurtosis

Age (years)     20.50       2.11        1.10       .08
AFQT            64.??      15.11         .32      -.45
M               61.29      25.05        -.0?      -.90
A               ?9.7?      15.17        -.66      -.02
G               72.66      15.16        -.30      -.80
E               71.72      17.62        -.75      -.03
ASVAB-WK        22.57       4.92        -.48      -.46
ASVAB-AR        13.??       3.51        -.03      -.67

* AFQT, M, A, G and E are reported in percentile equivalents while WK
and AR are reported in number-right score.

Table 2

Descriptive Statistics for Word Knowledge and Arithmetic
Reasoning Adaptive Tests.

                                    Standard
Prototype     Aptitude     Mean     Deviation      N

I             AR           -.25       .79         111
I             WK            ?        1.07          73
II            AR           -.1?       .76         117
II            WK            ?         .87         120
III           AR           -.07       .84         104
III           WK            .21       .85          67

Table 5

Intercorrelations* of AFQT, Age, Sex, and Test Score
Variables for Prototype III.


FQT 

.  IS 

NTS** 

.51 

.51 

.55 

.84 

.75 

.68 

.51 

r.T 

-  .03 

\ 

X 

.18 

.05 

.21 

.  14 

.07 

.15 

.14 

IX** 

X 

X 

S\ 

X 

X 

X 

X 

X 

X 

X 

.60 

.06 

l 

.27 

.44 

.46 

.43 

.25 

.73 

.50 

-.10 

X 

.40 

.05 

.35 

.63 

.39 

.32 

.38 

.24 

X 

.36 

.11 

\ 

.63 

.30 

.32 

.50 

.89 

.02 

X 

.73 

.50 

.35 

\ 

.57 

.77 

.53 

.87 

-.10 

X 

.54 

.70 

.32 

.81 

\ 

.42 

.42 

.70 

.06 

X 

.85 

.41 

.40 

.72 

.59 

.32 

.74 

.02 

X 

.54 

.43 

.51 

.76 

.74 

.59 

*Entries above diagonal are for the Arithmetic Reasoning adaptive
test, θ, and those below are for the Word Knowledge adaptive test.

**No female subjects.

Table 6

Mean and Standard Deviation of Test Administration Times.

Test          Mean Time          Standard Deviation

*ASVAB tests of AR and WK are fixed time.


Table 7

Responses to Adaptive Versus Linear Test


t'-ple  27  6 

12. S 


not  differ.  The  adaptive  tests  and  the  linear  tests  appear  to  be  measuring 
the  same  aptitude. 

Savings  were  obtained  in  the  average  time  required  to  complete  the  adaptive 
tests as compared to the conventional paper-and-pencil test. The Arithmetic
Reasoning (AR) subtest and the Word Knowledge (WK) subtest represent the
item types which usually require the most and least time per item to
administer, respectively. AR time was reduced to about 66 percent of the usual
required time, while WK time was reduced to less than half the usual time.

A  fully  adaptive  battery  could  be  expected  to  allow  for  an  increase  of  six 
subtests  given  in  the  same  time  required  to  administer  forms  6  and  7  of  the 
ASVAB.  This  would  provide  superior  measurement  by  enabling  more  data  to  be 
collected  on  each  examinee.  Reduction  in  classification  decision  errors 
would  devolve  from  this  additional  information. 

Examinees' responses to the questions on perceptions about the use of adaptive
testing prototypes were generally favorable, as has been found elsewhere
(Prestwood & Weiss, 1978). These methods allowed them to be tested at their
own  level  of  ability  and  to  proceed  at  their  own  rate.  In  addition,  many 
felt  that  this  kind  of  testing  was  easier  than  traditional  testing  because 
there  were  fewer  items  to  answer,  and  the  test  taking  was  less  fatiguing  than 
traditional  methods. 

This  effort  provides  a  successful  demonstration  that  adaptive  testing  can  be 
conducted without the use of expensive computers. Further exploration and
development with other aptitude areas and with a traditional criterion will
have to be accomplished before any long-range decisions are made about the
general implementation of these methods in the Armed Forces testing program.




REFERENCES


Birnbaum, A. Statistical theories of mental test scores. Lord and Novick
(Editors). Reading, Massachusetts: Addison-Wesley, 1968.

Edwards, A.L. Statistical methods for the behavioral sciences. New York:
Rinehart and Company, 1958.

Lord, F.M., & Novick, M. Statistical theories of mental test scores.
Reading, Massachusetts: Addison-Wesley, 1968.

McBride, J. A brief overview of adaptive testing. (Research Report 77-1).
Minneapolis: University of Minnesota, Department of Psychology,
Psychometric Methods Program, March 1977.

Peterson, J. Early conceptions and tests of intelligence. New York: World
Book Company, 1926.

Prestwood, J.S., & Weiss, D.J. The effects of knowledge of results and
test difficulty on ability test performance and psychological reactions
to testing. (Research Report 78-2). Minneapolis: University of Minnesota,
Department of Psychology, Psychometric Methods Program, September 1978.

Sympson, J.B. Estimation of latent-trait status in adaptive testing
procedures. (Research Report 77-1). Minneapolis: University of Minnesota,
Department of Psychology, Psychometric Methods Program, March 1977.

Weiss, D.J. Applications of computerized adaptive testing. (Research Report
77-1). Minneapolis: University of Minnesota, Department of Psychology,
Psychometric Methods Program, March 1977.

Wingersky, M., & Lord, F. A computer program for estimating examinee
ability and item characteristic curve parameters when there are omitted
responses. (Research Memorandum 73-2). Princeton, N.J.: Educational
Testing Service, 1973.



AFHRL-TR-80-66


AIR FORCE HUMAN RESOURCES LABORATORY
Brooks Air Force Base, Texas 78235

Errata 


First Author    Title


APHRLi-TR'Iin  7  (AD  Mm  .^kiniiu  Pt.Tftll  III  JTTTr  Uf  WPirUHIga  WPIMH  m  Mr  Force 

Pw.li  i  ill  h  ull  1 1|| 

AFHRL-TR-80-66 (AD-A097 353)  Friedman  Adaptive Testing Without a Computer

Due to norming problems encountered with ASVAB Forms 5, 6, and 7, percentile
scores derived from these test forms are in error. While the relative ranking
of individuals by their percentile scores would not be affected by the norming
errors, their absolute score values would be different. Therefore, descriptive
statistics reported in the subject technical reports above are erroneous;
other types of analyses in the report which use ASVAB percentile scores should
be interpreted with caution.


NANCY GUINN, Technical Director
Manpower  and  Personnel  Division 


